Enumerating suboptimal alignments of multiple biological sequences efficiently.
Abstract
Multiple sequence alignment is an important and widely applicable problem in molecular biology. Because the optimal alignment, the one that maximizes the score, is not always the most biologically significant, it is very useful to provide many suboptimal alignments as alternatives to the optimal one. For the alignment of two sequences, this suboptimal alignment problem is well studied, but for multiple sequences, investigating suboptimal alignments has been considered impossible because of the enormous size of the problem. The optimal multiple alignment can be obtained with the A* algorithm, and an efficient algorithm for the k shortest paths problem on general graphs was discovered recently. We extend these algorithms to compute the set of all aligned groups of residues in optimal and suboptimal alignments, and to enumerate suboptimal alignments. Because suboptimal alignments are numerous, we discuss which kinds of suboptimal alignments are unnecessary to enumerate and propose an efficient technique that enumerates only the necessary ones. The practicality of these algorithms is demonstrated through experiments. Moreover, the properties of suboptimal alignments of multiple sequences are also examined through experiments.
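To make the idea of enumerating suboptimal alignments concrete, here is a minimal Python sketch. It is not the algorithm of the paper: it handles only two sequences, uses an assumed toy scoring scheme (`MATCH`, `MISMATCH`, `GAP`), and the names `suboptimal_alignments`, `best_from`, `extend`, and `delta` are hypothetical. It treats an alignment as a path through the edit lattice and prunes any partial path whose best possible completion falls more than `delta` below the optimum, which is the same kind of exact completion bound an A*-style search relies on.

```python
from functools import lru_cache

# Assumed toy scoring scheme; the paper's actual scores are not given here.
MATCH, MISMATCH, GAP = 1, -1, -1


def suboptimal_alignments(a, b, delta):
    """Return (optimum, alignments) for all alignments within delta of the optimum."""
    n, m = len(a), len(b)

    @lru_cache(maxsize=None)
    def best_from(i, j):
        # Best achievable score for aligning the suffixes a[i:] and b[j:].
        if i == n and j == m:
            return 0
        options = []
        if i < n and j < m:
            s = MATCH if a[i] == b[j] else MISMATCH
            options.append(s + best_from(i + 1, j + 1))
        if i < n:
            options.append(GAP + best_from(i + 1, j))
        if j < m:
            options.append(GAP + best_from(i, j + 1))
        return max(options)

    optimum = best_from(0, 0)
    results = []

    def extend(i, j, score, row_a, row_b):
        # Prune: even the best completion cannot reach optimum - delta.
        if score + best_from(i, j) < optimum - delta:
            return
        if i == n and j == m:
            results.append((score, "".join(row_a), "".join(row_b)))
            return
        if i < n and j < m:
            s = MATCH if a[i] == b[j] else MISMATCH
            extend(i + 1, j + 1, score + s, row_a + [a[i]], row_b + [b[j]])
        if i < n:
            extend(i + 1, j, score + GAP, row_a + [a[i]], row_b + ["-"])
        if j < m:
            extend(i, j + 1, score + GAP, row_a + ["-"], row_b + [b[j]])

    extend(0, 0, 0, [], [])
    results.sort(key=lambda r: -r[0])  # highest-scoring alignments first
    return optimum, results


if __name__ == "__main__":
    optimum, alignments = suboptimal_alignments("AB", "AB", delta=3)
    for score, row_a, row_b in alignments:
        print(score, row_a, row_b)
```

Even in this tiny pairwise case the number of alignments grows quickly as `delta` increases, which illustrates why the abstract argues for enumerating only the necessary suboptimal alignments.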
Similar resources
Enumerating Suboptimal Alignments of Multiple Biological Sequences Efficiently
The multiple sequence alignment problem is very applicable and important in various fields in molecular biology. Because the optimal alignment that maximizes the score is not always biologically most significant, providing many suboptimal alignments as alternatives for the optimal one is very useful. As for the alignment of two sequences, this suboptimal problem is well-studied [6, 9, 12], but for th...
On Suboptimal Alignments of Biological Sequences
It is widely accepted that the optimal alignment between a pair of proteins or nucleic acid sequences that minimizes the edit distance may not necessarily reflect the correct biological alignment. Alignments of proteins based on their structures or of DNA sequences based on evolutionary changes are often different from alignments that minimize edit distance. However, in many cases (e.g. when the ...
Parallel Multiple Sequence Alignment with Decentralized Cache Support
In this paper we present a new method for aligning large sets of biological sequences. The method performs a sequence alignment in parallel and uses a decentralized cache to store intermediate results. The method allows alignments to be recomputed efficiently when new sequences are added or when alignments of different precisions are requested. Our method can be used to solve important biologic...
Molecular analysis of AbOmpA type-1 as immunogenic target for therapeutic interventions against MDR Acinetobacter baumannii infection
Introduction: Acinetobacter baumannii is associated with hospital-acquired infections. Outer membrane protein A of A. baumannii (AbOmpA) is a well-characterized virulence factor that plays important roles in the pathogenesis of this bacterium. Methods: Based on our PCR-sequencing of the ompA gene in the clinical isolates, the AbOmpA protein can be categorized into two types, named here type-1 and type-2. We ...
The Jalview Java alignment editor
Multiple sequence alignment remains a crucial method for understanding the function of groups of related nucleic acid and protein sequences. However, it is known that automatic multiple sequence alignments can often be improved by manual editing. Therefore, tools are needed to view and edit multiple sequence alignments. Due to growth in the sequence databases, multiple sequence alignments can o...
Journal: Pacific Symposium on Biocomputing
Publication date: 1997